Message announcements

ABSTRACT

An announcement thread addressing format which comprises a first sub-part concatenated with a second sub-part is described. The first sub-part is preferably the address of the party which generates the addressing identifier, whereas the second sub-part may be random data. An announcer apparatus may then use these address formats by including only those parts of an announcement thread address which render the address unique within the particular index message in which it is to be included, but not necessarily globally unique.

TECHNICAL FIELD

The present invention relates to an announcement method and system for use in a publish-subscribe architecture. The present invention also relates to a method and apparatus for allocating an identifier to a sequence of messages, and in particular to such methods and apparatus which are suitable for use in publish-subscribe architectures.

BACKGROUND TO THE PRESENT INVENTION AND PRIOR ART

Publish-Subscribe technologies are known in the art which allow users to monitor for information and the like by listening to known information channels. In our earlier published International patent application No. WO01/99348 we describe a publish-subscribe architecture we term the Generic Announcement Protocol (“GAP”), wherein messages relating to a defined subject are transmitted over communications channels which are listened to by listener applications. That is, GAP, and publish-subscribe technologies more generally, allow users to create channels that relate to a ‘subject’, which we generalise here to a ‘sequence of object versions’, which we will term a ‘thread’. Usually current approaches such as TIBCO TIBnet or Talarian SmartSockets (see http://www.talarian.com/industry/middleware/whitepaper.pdf) use hierarchical naming trees to identify channels. The hierarchical naming approach does at least ensure each identifier is unique across all the contexts in which any of the object versions may appear, which is an important requirement. But there is also a problem in that the technology must also manage changes in how people name subjects (e.g. company names change). With hierarchical naming, a change at any level in the hierarchy is disastrous for all systems lower in the hierarchy, because they are usually widely distributed.

A further problem with current approaches is that the name hierarchy also defines the authority to create new names. With current solutions, each enterprise has created its own top for its own hierarchy. However, the way these naming hierarchies have been designed makes them difficult to extend upwards, rather than downwards, leading to difficulty distributing naming hierarchies effectively across enterprise boundaries. Thus current systems are practically limited to deployment within one enterprise. Although pairs of enterprises can work out ways to share a hierarchy and manage new subject creation, this is not scalable to many, changing, arbitrary relationships between enterprises. It only works well if each merger was planned from the start. Also current approaches are designed so that new channels are created by system administrators for an enterprise, not just any user within the enterprise. Because many low-level relationships can exist between enterprises, channel creation is not efficient to control from one department in each enterprise, leading to frustration when what should be purely administrative steps are used as an opportunity to exert political/commercial controls. Current approaches also do not cope well where each enterprise has many relationships with other enterprise systems, each of which is regularly changing.

However, if hierarchies are not to be used, we then encounter a new problem: if anyone is to be able to create a channel identifier, they must be assured that it is unique, and preferably with no prior configuration or registration requirements.

Additionally, within indexed announcement schemes such as GAP (referenced previously), there is frequently the problem that channel identifiers are repeated many times within index messages, thus contributing to possibly large index messages, and hence reduced bandwidth efficiency.

The invention is intended to address at least some of the above problems.

SUMMARY OF THE INVENTION

The present invention overcomes at least the latter of the above described problems by using an announcement thread addressing format which comprises a first sub-part concatenated with a second sub-part. The first sub-part is preferably the address of the party which generates the addressing identifier, whereas the second sub-part may be random data. An announcer apparatus may then use these address formats by including only those parts of an announcement thread address which render the address unique within the particular index message in which it is to be included, but not necessarily globally unique.

Moreover, the present invention overcomes the other problems by using an announcement thread addressing format which comprises a meaningful part concatenated with a meaningless part. The meaningful part is preferably the address of the party which generates the addressing identifier, whereas the meaningless part may be random data. An allocator method and apparatus is therefore provided which acts to generate such announcement thread identifiers (AThIDs), and to allocate them to channels as appropriate.

In view of the above, from one aspect there is provided an announcement method for use in a publish-subscribe architecture, the method comprising: compiling an index message containing a plurality of sequence identifiers respectively identifying a plurality of sequences of messages, each message in each sequence relating to substantially the same subject matter; and transmitting the compiled index message onto an index channel; the method being characterised in that the sequence identifiers comprise at least two sub-parts, and the compiling step further comprises, for any sequence identifier to be included within the index message, including within the index message only those sub-parts of a sequence identifier which are necessary to uniquely identify the sequence identifier from the other sequence identifiers included within the message.

The first aspect has the advantage that only those sub-parts of a sequence identifier which are required to identify the sequence identifier within the index message (i.e. relative to the other sequence identifiers in the index message) are included in the index message, thus shortening the length of the index message and improving bandwidth efficiency.
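As an illustration only, the compiling step might be sketched as follows. This is one possible reading of the aspect above, not the procedure of the embodiment: it assumes each sequence identifier is a pair of a first sub-part (an allocator string) and a second sub-part (a number), and it drops the first sub-part whenever the number alone is unique within the message. The function name `compile_index` and the `$` separator used in the output strings are assumptions made for the sketch.

```python
from collections import Counter


def compile_index(entries):
    """Return the identifier strings to place in one index message.

    entries: list of (first_sub_part, second_sub_part) pairs, where the
    second sub-part is an integer. Hypothetical helper for illustration.
    """
    # Count how often each second sub-part occurs in this message.
    counts = Counter(number for _, number in entries)
    compiled = []
    for first, number in entries:
        if counts[number] == 1:
            # The number alone is unique within this message.
            compiled.append(f"${number}")
        else:
            # A clash inside the message: include both sub-parts.
            compiled.append(f"{first}${number}")
    return compiled
```

With three entries of which two share the same number, only the third is shortened; the two clashing entries keep both sub-parts so that a listener can still tell them apart.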

In a preferred embodiment, the first aspect further comprises the steps of requesting the allocation of a sequence identifier from an allocator, and receiving a message from the allocator containing the requested sequence identifier. This allows for allocation of sequence identifiers to be performed by a third party.

From another aspect there is provided a method of allocating a sequence identifier to a sequence of messages relating to substantially the same subject matter and which are to be transmitted onto one or more communications channels, the method comprising:

generating a meaningless sequence identifier part;

combining the generated meaningless identifier part with a meaningful sequence identifier part to provide the sequence identifier; and

allocating the sequence identifier to the sequence of messages; wherein the meaningless sequence identifier part is generated such that when combined with the meaningful sequence identifier part the resulting sequence identifier is unique at least at that time, and wherein when the messages in the sequence are subsequently transmitted, the identifier is at least partially incorporated therein so as to identify the sequence.

Preferably, a first sub-part of a sequence identifier is a network address or other network locator. This allows for the degree of permanence required in the identifier, whilst allowing for a degree of control to be retained with the allocating party.

In an embodiment the first sub-part is preferably a Uniform Resource Locator (URL). This provides advantages in sequence identifier allocation due to the feature of a URL that it can represent both a process (e.g. an HTTP daemon) and persistent data stored on a machine. It can also be used to represent a programme dedicated to AThID allocation, which can be accessed through the generic process serving all URLs of that scheme, using techniques such as the common gateway interface (CGI).

Alternatively, the first sub-part may be an email address. This provides the advantage that it is easy for a human operator to remember.

In other embodiments of the invention the first sub-part is an Internet Protocol network address. This provides advantages in allocation in that most network entities are already allocated IP addresses, and hence such an allocation scheme would be easy to implement.

Moreover, in embodiments of the invention a second sub-part of the sequence identifier is preferably a number, and furthermore is preferably randomly generated. The use of numbers allows for convenient generation by a computer or other machine.

In a preferred embodiment, the number used as the meaningless part of the sequence identifier is produced by applying a hash function to data defining the subject matter of the sequence of messages. This provides a link via the hash function between the actual definition of the subject matter of the sequence of messages and the resulting number, such that if a new sequence identifier is required for different subject matter (i.e. the subject matter has been newly defined), a new number will be obtained as a result of the hash of the new definition.
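A minimal sketch of such a hash-derived number is given below. The text does not name a hash function, so the use of SHA-256 is an assumption, as is the function name `ath_number_from_definition`; the output range of 1 to 65535 follows the 16-bit announcement thread number defined later in this description, with zero reserved.

```python
import hashlib


def ath_number_from_definition(definition: str) -> int:
    """Derive a thread number in 1..65535 from a subject definition.

    SHA-256 is an assumed choice; any well-mixed hash would serve.
    """
    digest = hashlib.sha256(definition.encode("utf-8")).digest()
    value = int.from_bytes(digest[:8], "big")
    # Zero is reserved, so fold the hash into the range 1..65535.
    return value % 65535 + 1
```

The same definition always yields the same number, while a changed definition will in general yield a different one. Because only 65535 values are available, two different definitions can still collide, which is why a uniqueness check remains useful.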

In a preferred embodiment there is further included the step of checking if the generated meaningless part of the sequence identifier has been previously generated, and if so generating another meaningless sequence identifier part; wherein the checking and generating steps are repeated until a meaningless sequence identifier part is obtained which has not been previously generated. This ensures that the resulting obtained sequence identifier is unique across the present usage space.
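The check-and-regenerate loop just described might look like the following sketch. It assumes the allocator keeps the set of previously generated numbers in memory and draws candidates uniformly from 1 to 65535; the function name `allocate_ath_number` and the optional `rng` parameter are hypothetical.

```python
import random


def allocate_ath_number(used, rng=None):
    """Return a number in 1..65535 not in `used`, and record it.

    used: a set of numbers already generated by this allocator.
    rng:  optional random.Random instance (assumed parameter).
    """
    rng = rng or random.SystemRandom()
    if len(used) >= 65535:
        raise RuntimeError("announcement thread number space exhausted")
    while True:
        candidate = rng.randint(1, 65535)
        # Repeat the generating step until an unused number is found.
        if candidate not in used:
            used.add(candidate)
            return candidate
```

A uniform draw also avoids biasing allocation towards any particular value. The loop slows down as the number space fills up, so a real allocator with a nearly full space might prefer to choose directly from the remaining free numbers.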

Additionally, preferred embodiments of the invention further comprise the step of receiving a request for a sequence identifier, the allocating step then further comprising transmitting the subsequently obtained sequence identifier to the party or element from which the request was received. Such functionality allows for third parties with possibly unstable contexts themselves to request and obtain sequence identifiers from a possibly more stable identifier allocator.

From a further aspect, the invention also provides an announcement method for use in a publish-subscribe architecture, the method comprising: transmitting a sequence of messages relating to substantially the same subject matter on to one or more communications channels, the method being characterised by including in each message at least part of a sequence identifier, the sequence identifier having been allocated to the sequence as described above.

Additionally, from a yet further aspect there is also provided an announcement method for use in a publish-subscribe architecture, the method comprising transmitting an index message onto an index channel, the index message containing one or more sequence identifiers respectively identifying one or more sequences of messages, each message in each sequence relating to substantially the same subject matter, the method being characterised in that the sequence identifiers are respectively allocated to the sequences of messages as previously described.

From another aspect there is provided an announcement system for use in a publish-subscribe architecture, the system comprising: message compiling means arranged in use to compile an index message containing a plurality of sequence identifiers respectively identifying a plurality of sequences of messages, each message in each sequence relating to substantially the same subject matter; and means for transmitting the compiled index message onto an index channel; the system being characterised in that the sequence identifiers comprise at least two sub-parts, and the message compiling means is further arranged to operate, for any sequence identifier to be included within the index message, to include within the index message only those sub-parts of a sequence identifier which are necessary to uniquely identify the sequence identifier from the other sequence identifiers included within the message.

A further aspect also provides an apparatus for allocating a sequence identifier to a sequence of messages relating to substantially the same subject matter and which are to be transmitted onto one or more communications channels, the apparatus comprising:

identifier part generation means for generating a meaningless sequence identifier part;

sequence identifier generation means arranged to combine the generated meaningless identifier part with a meaningful sequence identifier part to provide the sequence identifier; and

sequence identifier allocating means for allocating the sequence identifier to the sequence of messages; wherein the meaningless sequence identifier part is generated such that when combined with the meaningful sequence identifier part the resulting sequence identifier is unique at least at that time, and wherein when the messages in the sequence are subsequently transmitted, the identifier is at least partially incorporated therein so as to identify the sequence.

Within these further aspects the corresponding advantages and further features may be obtained as already described above in respect of the first aspect and second aspect respectively.

From another aspect, the present invention further provides a computer program or suite of programs arranged such that when executed by a computer system it/they cause/s the system to perform the method of any of the above described aspects. The computer program or programs may be embodied by a modulated carrier signal incorporating data corresponding to the computer program or at least one of the suite of programs, for example a signal being carried over a network such as the Internet.

Additionally, from a yet further aspect the invention also provides a computer readable storage medium storing a computer program or at least one of a suite of computer programs according to the aspect described above. The computer readable storage medium may be any magnetic, optical, magneto-optical, solid-state, or other storage medium capable of being read by a computer.

BRIEF DESCRIPTION OF THE DRAWINGS

Further features and advantages will become apparent from the following description of an embodiment of the invention, presented by way of example only, and by reference to the accompanying drawings, wherein:

FIG. 1 is a system block diagram of the general system architecture in which the invention is intended for use;

FIG. 2 illustrates an announcement message format used by the announcement system in which the invention is used;

FIG. 3 is a message sequence diagram illustrating the sequence of messages that are transmitted in an embodiment of the invention;

FIG. 4 is a flow diagram illustrating the steps performed by an allocator in the embodiment of the invention;

FIG. 5 illustrates a relative sequence identifier provided by an embodiment of the invention;

FIG. 6 illustrates the binary format of a sequence identifier provided by the embodiment of the invention;

FIG. 7 illustrates how several sequence identifiers may be combined into a single index announcement message in an embodiment of the invention; and

FIG. 8 is a flow diagram illustrating the operation of an announcer in an embodiment of the invention when using the sequence identifier format presented herein.

DESCRIPTION OF THE EMBODIMENTS

An embodiment of the invention will now be described with respect to FIGS. 1 to 7.

FIG. 1 illustrates a publish-subscribe architecture which constitutes the operating environment of the present invention. This will be described next, and the terminology to be used herein defined thereby.

In FIG. 1 an announcing application 10 is provided running on a computer system or the like (not shown). The announcing application operates to generate or otherwise process information which is to be announced by transmission of a message (an announcement) relating to a predefined subject onto a communications channel 18. The scope of the operation of the announcing application 10 as used herein is deliberately broad, as the announcing application could be any application which produces information relating to any characteristic of any sort of entity. As one example, an announcing application 10 could be installed on a temperature sensor, where it acts to periodically announce the temperature sensed by the sensor. In another example, the announcing application could be located as part of the system of a stock exchange, and act to announce the share price of a particular share, or the index level of a stock index. In another application, the announcing application could be used in a distributed programming environment to track the value taken by a variable internal to a program, and to produce information relating to the value of that variable.

The announcing application 10 communicates with an announcer 12. The announcer 12 is a software programme forming part of a communication middleware that is given information by other locally running programmes (i.e. the announcing application 10) to announce information globally but efficiently to any interested parties by virtue of the transmission of messages onto the communications channel 18. ‘Locally’ here usually means on the same computing device, but an announcer 12 may be arranged on one device to act for a number of locally connected devices.

Additionally provided as part of the publish-subscribe architecture is a listener 16. The listener 16 is another software programme which forms part of the communication middleware. It receives the messages sent by the announcer 12 on the appropriate communications channels 18. The listener 16 acts to communicate with a listener application 14, which is the application which makes use of the information provided by the announcing application 10. Thus, continuing the examples given above, the listener application 14 could be an industrial control application which acts to control an industrial process in response to the temperature sensed by the temperature sensor, and communicated to the listener 16 in a message from the announcer 12.

It should be noted here that the announcer 12 and listener 16 are completely decoupled, which means that the announcer 12 does not need to have any information about the identity, the credentials or the number of listeners.

When the announcing application 10 continually updates and produces new information relating to the data, object or entity to which it relates, at each update a new announcement message is created and transmitted by the announcer 12. We define such a sequence of related announcement messages to be an “announcement thread”, with each individual message in the sequence being an “announcement version”. A new version of an announcement (an announcement version) is assumed to contain information related to previous versions in some way specific to the application making the announcements.

An announcement message is therefore a new announcement version of an announcement thread, and could occur at any unknown time in the future. The new announcement version expresses an update of specific information relating to the data, objects, or entities which the announcing application is monitoring.

Within such an architecture there is a clear need to be able to identify announcement threads, being the sequences of messages transmitted onto the communications channel 18. This is so listeners can receive an announcement message and know to which thread the announcement message relates and thereby determine the subject matter of the message. Usually, the subject matter of an announcement thread will have been defined in advance.

Therefore, in order to allow such identification, each announcement thread is provided with an announcement thread identifier (AThID), which is the globally unique identifier for an ANNOUNCEMENT THREAD. Within an announcement message, both the announcement thread identifier 201 and the announcement version 202 (usually a numeric value) are included, as shown in FIG. 2.

In order to provide for globally unique AThIDs, an allocator 20 is provided. An allocator 20 is an entity that creates AThIDs for every new announcement thread at the request of an announcer 12. The allocator 20 is therefore arranged to communicate with the announcer 12, usually over the communications channel 18. The allocator 20 is preferably a software application running on a host computer system, but could in some embodiments be a human.

Note here that the allocator 20 and the announcer 12 are completely decoupled. An allocator 20 and an announcer 12 communicate together only for the creation of a new AThID.

For use within such an architecture, an AThID must have certain properties. Firstly, an AThID should be globally unique across all the spaces where it may eventually become relevant. This is because the identifier may become relevant to a context that did not exist when the identifier was created. Allowing listener mobility is enough to require global uniqueness.

Secondly, such AThIDs should preferably not be subject to a hierarchical registration scheme. An obvious solution to the problem of AThID allocation would be to create unique identifiers by registering them with a hierarchical registration system with a single global root. However, open systems that allow people and programmes to create new objects autonomously are preferable over those requiring registration. Even where registration is delegated hierarchically, creation of the hierarchy becomes an obstacle to immediate use of the system. Also, a registration hierarchy is often perverted into a permission hierarchy by those that control it. For these reasons we do not favour such registration schemes.

A third factor to be considered is the stability of the AThID. If we reject uniqueness by registration, an alternative is to allocate identifiers that are only unique relative to a pre-existing unique identifier of the allocator, then concatenate the two. However, by doing this, we are making the identifier relative to one of its parent contexts. But, because every set of objects exists in multiple contexts, we then have to guess which parent context is going to outlive all the others. Therefore, we have to carefully choose which pre-existing unique identifier to use, to ensure it will rarely be in a context that may die before its children.

Additionally, AThIDs must be designed in a simple manner so that they can be used efficiently with applications such as HTTP, SNMP and LDAP that use an ASCII representation, so an ASCII scheme is required.

In order to meet the above requirements, in the present invention we propose a preferred ASCII representation for an absolute AThID, which consists of three mandatory parts concatenated together with the identifiers and separators as shown below:

“ath:” <Scheme id> “=” <Allocator id> “$” <Announcement thread number>

We also present a corresponding binary representation, but this will be described later.

Within the ASCII representation the prefix “ath:” indicates that the string is an AThID, and the following string gives the scheme ID. The scheme ID indicates to the listener which receives a message containing such an AThID what the format of the rest of the AThID will be, and in particular what form the Allocator ID field (AllID) will take. We present a number of possible schemes below, and recommend one of them. However, for future proofing, we still include the ability for new allocation schemes to be introduced by including the scheme identifier in every full AThID.

Following the Scheme ID field is an “=” sign, after which the Allocator ID is included. This is an identifier or address code which uniquely identifies the allocator 20 which generated the AThID. This is the meaningful part of the AThID, as it indicates to a recipient which allocator 20 generated the AThID. The format of the AllID will depend on the scheme, as described below.

Following the AllID is a “$” symbol, after which there is included an announcement thread number field. The announcement thread number (ATh#) may be any integer in the range 1-65535. ATh#=0 is reserved (for reasons only relevant when we introduce the binary representation). We do not allow textual ATh#s, to avoid the emotional or commercial attachments people would otherwise form to certain names.

For efficiency of other parts of the system, particularly binary index representations (see later), allocation of ATh#s must not be biased towards any specific value.

Therefore, allocation of announcement thread numbers is preferably random within the available number space, and hence the actual number chosen carries no meaning.

Moreover, it will be appreciated that in other embodiments numbers may be replaced with letters, or with alphanumeric sequences.

In the preferred embodiment case-insensitive text strings are used to represent each scheme ID in the ASCII representation of an AThID (see the column headed SchTx in Table 1 below). The binary scheme identifier may be any of 0-15 but we only use one code point (1) from the 16 in this space for our recommended scheme, as will be described. We would expect new scheme identifiers (both their binary and ASCII representation) to be registered by the Internet Assigned Numbers Authority (IANA).

Similarly, the new “ath:” URI scheme will need to be registered with IANA.

Some candidate schemes for allocator IDs are given in Table 1. All but a couple of the candidate allocator identifier schemes use pre-existing identifiers that are already unique.

TABLE 1 Candidate allocator identifier schemes

Binary SchID   SchTx   width/b   Description                                          Notes
-              IPv4    32        IPv4 addr of allocator
-              IPv6    128       IPv6 addr of allocator
-              MAIL    var       E-mail address of owner of allocator
1              URL     var       URL of allocator
-              IANA    ?         IANA assigned allocator id (hierarchical)
-              GAP     ?         Allocator id claimed on well-known GAP channel

A first possible scheme is the use of an IP ADDRESS SCHEME. This scheme uses an IP address as an allocator ID and is very easy to set up. However to be effective it requires that the (possibly many) operators of that machine remember which AThIDs have been allocated under that allocator id. Otherwise it is possible that a new operator might not be told that the machine had a set of AThIDs associated with this IP address. That means that different operators could use a similar AThID for different purposes.

An alternative scheme is the MAIL SCHEME. This scheme uses an individual's email address as an allocator ID. However an email address is not a very stable allocator ID, and it could be changed or taken from an allocator without the allocator's control. This suggests using a neutral address like AThIDmaster@macdonalds.farm.com, but still leaves the problem of name changes.

A third possible scheme is a URL SCHEME. This scheme uses a uniform resource locator (URL) as an AThID allocator id. The neat feature of a URL is that it can represent both a process (e.g. an HTTP daemon) and persistent data stored on a machine. It can also be used to represent a program dedicated to AThID allocation, which can be accessed through the generic process serving all URLs of that scheme. Therefore, an allocator identifier can be chosen with a likely persistence that should outlive all the AThIDs it will allocate. A human allocator (if used) is not limited to choosing an allocator identifier under her control and therefore in a transient context. For instance highly persistent organisations can set up a simple AThID allocator programme accessible through their CGI.

Therefore, we recommend the URL scheme because a URL can be as stable or as volatile as required, and no-one is restricted to only use URLs within their own contexts, because URLs can be made available to anyone from anywhere on the Internet. An example AThID using our recommended URL scheme for the allocator identifier would look as follows:

<ath:URL=http://www.hosting.org/AThID?set=farm$31425>
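By way of illustration, an absolute ASCII AThID of this form could be split into its three parts as sketched below. The names `AThID` and `parse_athid` are hypothetical. The sketch assumes that the scheme ID ends at the first “=”, that the announcement thread number follows the last “$” (since a URL may itself contain “=”), and that the scheme text is normalised to upper case because it is treated as case-insensitive.

```python
from typing import NamedTuple


class AThID(NamedTuple):
    scheme: str         # e.g. "URL"
    allocator_id: str   # the meaningful part
    thread_number: int  # the meaningless part, 1..65535


def parse_athid(text: str) -> AThID:
    """Parse an absolute ASCII AThID, with or without <...> delimiters."""
    text = text.strip()
    if text.startswith("<") and text.endswith(">"):
        text = text[1:-1]
    if text[:4].lower() != "ath:":
        raise ValueError("missing 'ath:' prefix")
    # Scheme ID ends at the first '='; thread number follows the last '$'.
    scheme, eq, rest = text[4:].partition("=")
    allocator_id, dollar, number = rest.rpartition("$")
    if not (eq and dollar and scheme and allocator_id):
        raise ValueError("expected ath:<scheme>=<allocator>$<number>")
    if not (number.isascii() and number.isdigit()):
        raise ValueError("thread number must be a decimal integer")
    value = int(number)
    if not 1 <= value <= 65535:
        raise ValueError("thread number must be in the range 1-65535")
    return AThID(scheme.upper(), allocator_id, value)
```

Applied to the example above, this yields the scheme `URL`, the allocator identifier `http://www.hosting.org/AThID?set=farm` and the thread number 31425.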

Note that an AThID contains a URL when using the URL scheme for the allocator id, but it is not strictly a URL itself; it is a uniform resource identifier (URI), meeting all the definitions and requirements of a URI. An AThID URI doesn't locate information. Rather, an AThID is used indirectly to reference configuration information that locates object versions in both space and time, even though announcement timing is unknown in advance. On this basis, one might argue that most resource locators do not directly locate their resource either, nor do they contain sufficient information to locate it indirectly. For instance, an HTTP URL does not usually locate information directly; if it contains a hostname it relies on configuration information in a DNS. An HTTP URL doesn't even contain the IP address of any DNS resolver even though it depends on one. However, we can still say that an HTTP URL is a locator, because it only relies on static configuration information that is not unique to the resource being located. An AThID, on the other hand, is not a locator, because it relies on further configuration information specific to the resource in question. Thus, an AThID is an identifier, only locating a resource when used as the key into a local database of configuration information collected earlier. Nevertheless, we have chosen to ensure that the syntax we define for an AThID meets all the requirements for a URL, because the motivation for most of these requirements is unchanged whether dealing with identifiers or locators.

Where a number of AThIDs appear within one context (e.g. a list), to avoid repetition of similar material, we can define a RELATIVE ATHID. For instance, if the context had already defined the base URI as <ath:URL=http://www.hosting.org/AThID?set=farm> then the relative URI <$31425> would suffice to specify the above absolute AThID. Even if the base URI had a different ATh# appended, the new relative URI would supersede it, as specified in the rules on parsing relative URLs in RFC1808 (as updated by RFC2368 and RFC2396) (assuming again that the motivations for relative URL rules are unchanged for URIs). Note that an AThID without an ATh# appended is invalid.
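The resolution of a relative AThID of the form “$” followed by a number against a base might be sketched as follows. This is a simplified illustration and not a full implementation of the relative URL rules cited above; the function name `resolve_athid` is hypothetical, and the sketch assumes that any ATh# already on the base appears as a trailing “$” followed by decimal digits.

```python
import re


def resolve_athid(reference: str, base: str) -> str:
    """Resolve a relative AThID such as '$31425' against a base AThID."""
    if reference.startswith("$"):
        # Drop any thread number already appended to the base; the
        # relative reference supersedes it.
        stem = re.sub(r"\$\d+$", "", base)
        return stem + reference
    # Anything else is treated here as already absolute.
    return reference


base = "ath:URL=http://www.hosting.org/AThID?set=farm"
print(resolve_athid("$31425", base))
```

Passing a base that already ends in a thread number, such as the absolute AThID above, and the reference `$7` gives the same allocator part with `$7` appended.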

Within our ASCII representation “ath:” is the URI's scheme name, and is also optional for a relative AThID. But if the allocator identifier is present, it must be preceded by its own allocation scheme identifier (e.g. “URL=”). The allocator identifier deliberately does not start with a “//” signifying that there is no network location and we are not using generic resource locator syntax, preventing further processing as a relative URL. However, the URL used for the allocator identifier may itself be relative to a base URL, if and only if the context of the relative URL of the allocator identifier is clearly distinguishable from the context of the whole AThID URI.

When the optional “ath:” prefix isn't present, the resulting relative AThID bears a passing similarity to the URL of a non-AThID scheme. However, a valid URL would start with “URL:” not “URL=”. Because of this potential ambiguity, this relative form must only be used in contexts where only an AThID would be expected by human users.

Having described the ASCII representation of our preferred AThID format, we now describe a binary representation.

The proposed binary representation of an absolute announcement thread identifier (AThID) is similar to, but not the same as, the ASCII representation. One difference is that the context in which binary representations will be used makes any prefix like “ath:” redundant. A binary AThID consists of three parts concatenated together (we use ‘|’ to represent concatenation):

<Scheme id> | <Allocator id> | <Announcement thread number>

Here, the ANNOUNCEMENT THREAD NUMBER (ATh#) is a 16 bit integer. ATh#=0 is reserved. Additionally, the SCHEME ID is a 4 bit integer, with only one code-point defined, SchID=1 meaning the URL scheme already recommended above, as shown in the ‘SchID’ column of Table 1.

The form of the allocator identifier depends on which scheme identifier is used. Clearly, if the IPv4 or IPv6 schemes were used, the allocator identifier would simply be the 32 or 128 bit IP address respectively. For the URL scheme, the allocator identifier is just the string of octets that are identical to the ASCII allocator id.

Relative binary AThIDs as described above would be expected to be extremely common. They consist of the ATh# alone, resulting in a simple binary representation as shown in FIG. 5. Here it will be seen that only the 16-bit ATh# is given.

The above definitions of the AThID parts do not give any clue as to the bit width of an absolute binary AThID, unless the scheme identifier implies a fixed width allocator id (such as in the case of the IPv4 or IPv6 allocator ID schemes). Therefore, we recommend using the representation convention shown in FIG. 6 for binary AThIDs in protocols, and in particular in binary announcement messages.

Within FIG. 6 the leading 16 bits of zeroes allow an absolute AThID to be distinguished from a relative one (recall that zero is a reserved value for the ATh#). The 12 bit AllID length field gives the length of the AllID field in 32 bit chunks, making the maximum allowable allocator ID 16,384B. For efficiency, it would be wise to keep the length as short as possible. Also, although there is no specified limit to URL length, in practice most URL handling software has a limit: very early versions of some Mosaic-derived browsers had a 256 character URL limit, while Microsoft Internet Explorer (v5.5 at least) has a limit of 2,083 characters. Server software may also be limited, although Apache can handle up to about 8 kB URLs. For AllIDs that do not require a whole multiple of 4 octets, the remnant is padded with zeros. No ASCII allocator identifier scheme should allow the null character. The AllID length field is redundant if SchID implies a fixed width allocator id, but it saves knowledge of new scheme ids having to be embedded in protocol parsers.
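A rough sketch of the FIG. 6 convention, as far as it can be read from the text alone, might look as follows. Several details are assumptions because the figure is not reproduced here: fields are taken to be big-endian, the SchID is placed in the high 4 bits of the second 16 bit word with the AllID length in the low 12 bits, and the ATh# is assumed to follow the padded AllID. The function names are hypothetical. Note also that a 12 bit count holds at most 4095 words, so this sketch accepts allocator ids of up to 16,380 octets; reading a zero count as 4096 words would be one way to reach the 16,384B figure, but that is a guess.

```python
import struct

SCHEME_URL = 1  # the only scheme id code-point defined in the text


def encode_relative(number: int) -> bytes:
    """A relative binary AThID is just the non-zero 16 bit ATh#."""
    if not 0 < number < 2 ** 16:
        raise ValueError("ATh# must be a non-zero 16 bit integer")
    return struct.pack("!H", number)


def encode_absolute(scheme_id: int, allocator: bytes, number: int) -> bytes:
    """Encode 16 zero bits | SchID | AllID length | AllID | ATh#."""
    if not 0 <= scheme_id < 16:
        raise ValueError("SchID is a 4 bit integer")
    if b"\x00" in allocator:
        raise ValueError("allocator id must not contain the null character")
    padded = allocator + b"\x00" * (-len(allocator) % 4)  # pad to 4 octets
    words = len(padded) // 4
    if words >= 2 ** 12:
        raise ValueError("allocator id too long for a 12 bit length field")
    # Assumed field order: SchID in the high 4 bits, length in the low 12.
    header = struct.pack("!HH", 0, (scheme_id << 12) | words)
    # Assumed position: the ATh# trails the padded allocator id.
    return header + padded + encode_relative(number)


def decode(buf: bytes, offset: int = 0):
    """Return ((scheme_id, allocator, number), next_offset)."""
    (first,) = struct.unpack_from("!H", buf, offset)
    if first != 0:
        # A non-zero first word is a relative AThID: the ATh# itself.
        return (None, None, first), offset + 2
    (second,) = struct.unpack_from("!H", buf, offset + 2)
    scheme_id, words = second >> 12, second & 0x0FFF
    start = offset + 4
    end = start + 4 * words
    allocator = buf[start:end].rstrip(b"\x00")  # drop zero padding
    (number,) = struct.unpack_from("!H", buf, end)
    return (scheme_id, allocator, number), end + 2
```

The decoder shows why the width is only known after the first one or two 16 bit words have been read and parsed.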

The binary AThID convention set out above inescapably means that the width of a binary AThID is unpredictable without reading the first word, parsing it, then reading the second word if necessary, then parsing that too. However, this is an application layer protocol: we are concerned about performance, because index announcements are processed very repetitively, but we need not be concerned beyond a certain point.

We now give an example of the use of this binary representation in an index announcement message, with reference to FIG. 7 which shows the binary layout of the payload of such a message. An index announcement message is simply a table of AThIDs against their respective version numbers, which are 16 bit integers. Index announcement messages as used in the context of the GAP publish-subscribe system are described in our earlier International patent application WO01/99348, as referenced earlier, the contents of which necessary for understanding the format and use of index announcement messages being incorporated herein by reference.

Within an index announcement message each AThID may well have a different allocator ID, but relative AThIDs may be used nearly all the time, because each listener of the index has been previously told that the absolute AThID they are interested in will be in a specific index announcement on a specific channel. Therefore, as long as it is unique within the index, each ATh# will imply the absolute AThID that ends with that ATh#. Therefore, all the index announcer has to do is include the absolute AThID for any pairs of AThIDs that happen to have identical ATh#s. Thus the payload of an index announcement might look as shown in FIG. 7.

Here, Ath#_4 would appear twice, so the announcer qualifies both occurrences of it with the full, absolute AThID specification. For all the other AThIDs (1-3,5,6) the short, relative AThID is sufficient.

If it became necessary to continually repeat an allocator ID because of a clash, it would be possible to define an abbreviated symbol for it, as is done in XML namespaces. In a way, this is similar to the internal symbols used when compressing data.

FIG. 8 illustrates an example process to allow an announcer 12 in a publish-subscribe system architecture such as that shown in FIG. 1 to perform the above described operation using relative AThIDs to reduce the size of index messages.

Firstly, imagine an announcer 12 is to compile an index message for transmission on the communications channel 18. The announcer 12 will have been in contact with one or more announcing applications 10 and will have received indications from them that a respective announcement for those applications is required. Preferably, an announcing application 10 passes announcement information to the announcer 12 regarding the AThID and version number for each announcement which it requires. The announcer 12 receives this information from each announcing application which it serves and stores it for use when compiling a new index message.

In order to compile a new index message the process shown in FIG. 8 may be used. Here, first of all the announcer 12 retrieves the stored information regarding those AThIDs and version numbers for which announcements must be made at step 8.2. Then, at step 8.4 for each retrieved AThID and version number a check is performed to see if the ATh# of the AThID is already in the index message. If not then it is determined that the ATh# itself will be sufficient to identify the announcement thread within the index message without any further information being required, and hence processing proceeds to step 8.10, wherein the ATh# and the version number from the AThID are placed into the payload of the index message (see FIG. 7). Then, processing proceeds to step 8.12, wherein it is determined whether or not there are any further announcements to be placed in the index message payload, and if so then processing proceeds back to step 8.2, and the procedure begins again. Essentially, step 8.12 causes the process to be repeated for every announcement which the announcer has buffered and awaiting announcement.

Returning to step 8.4, if it is determined here that an ATh# is already within the payload of the index message being compiled then it will be necessary to include further information relating to the AThID of the announcement to be included within the message, if the announcement is to be capable of unique identification. Thus, if this is determined to be the case at step 8.4 then processing proceeds to step 8.6 wherein the full AThID of the announcement is obtained from the announcer's local memory store, and at step 8.8 the full AThID is then placed within the index message payload. Processing then proceeds to step 8.12, wherein the evaluation as to whether all of the announcements have been included in the message payload is made, as described above.

Following the procedure outlined above, the full AThID is only used in the announcement message when it is necessary because an announcement with the same ATh# as an announcement to be included in the index message is already present therein. At other times, only the ATh# is used, thus resulting in a much reduced payload within the index message than would be the case if the full AThID were to be used for every announcement.
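The FIG. 8 loop can be sketched in a few lines. This is an illustration only: `compile_index` and `ath_number` are hypothetical names, AThIDs are held in their ASCII form, and the payload is modelled as a list of pairs rather than the binary layout of FIG. 7. The sketch follows the steps as worded, so the first occurrence of a clashing ATh# stays relative and only later occurrences carry the full AThID; qualifying both occurrences, as the FIG. 7 example does, would need a prior pass to count ATh#s.

```python
from typing import Iterable, List, Tuple, Union

# A payload entry is (relative ATh# or full AThID, version number).
Entry = Tuple[Union[int, str], int]


def ath_number(athid: str) -> int:
    """Extract the ATh# from an ASCII AThID (digits after the last '$')."""
    return int(athid.rpartition("$")[2])


def compile_index(pending: Iterable[Tuple[str, int]]) -> List[Entry]:
    payload: List[Entry] = []
    seen = set()                                # ATh#s already in the payload
    for athid, version in pending:              # step 8.2, repeated via 8.12
        number = ath_number(athid)
        if number in seen:                      # step 8.4: ATh# already used
            payload.append((athid, version))    # steps 8.6 and 8.8: full AThID
        else:
            payload.append((number, version))   # step 8.10: ATh# alone
            seen.add(number)
    return payload
```

With two allocators that both issued ATh# 4, only the second of the two entries is written in full; every other entry costs a single 16 bit ATh#.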

Having described the AThID format provided by the present invention, and also the operation of an announcer when using the format, we now describe the operation of an allocator program which is able to perform the task of the allocator 20 in the architecture described above.

A managed allocator programme could be very rudimentary. It would only need parameters that allowed a user (i.e. an Announcer 12) to perform the following functions:

i) Register new AThIDs (respecting the above requirement that the choice of ATh#s is not biased to certain parts of the number space);

ii) Unregister an existing AThID (see later); and

iii) There may also need to be methods to create and destroy sets of AThIDs (e.g. the set ‘farm’ in the example above).

An allocator programme might optionally support association of textual strings with AThIDs as they are created, in order to provide human-readable descriptions of announcement threads. We will discuss the association of a textual string to an AThID (XML file) in the example operation given below.

Returning to FIG. 1, imagine that the announcing application 10 requires a new AThID. In such a case a request for a new AThID will be made from software associated with the announcing application, to the allocator 20.

In order to do this, within the described embodiment the announcing application generates a human readable description of the information to be announced. This is a description of the subject matter of the announcement thread to which the desired AThID will be applied. The description could be a simple .txt file or a .doc file etc. However our suggestion is to use the Extensible Markup Language (XML). We use XML because it offers a unique combination of flexibility and simplicity for both humans and machines.

An example human-readable description of the information XML file is given below:

<?xml version="1.0" standalone="yes"?>
<HEADER>
<HEADLINE>GAP Announcement</HEADLINE>
</HEADER>
<FROM>alice@company.com</FROM>
<DATE>Feb. 2, 2003</DATE>
<ITEM>
<DESCRIPTION>Standard version for 3G protocol release 3.0</DESCRIPTION>
</ITEM>

The description of the announcement thread is contained in the sections marked <DESCRIPTION> </DESCRIPTION>, whereas the section marked <VALUE> represents a random number that is used to generate different ATh#s. If two announcement threads with different descriptions were to be given the same ATh#, then the random value is changed by the allocator 20 in order to maintain the uniqueness of the ATh#. The announcing application 10 generates a random number simply for data handling process reasons.

The request from the announcing application 10 to the allocator 20 consists of an HTTP request/reply as illustrated in FIG. 3. The announcing application 10 sends a POST request containing: the URL of the ALLOCATOR, the protocol version and a MIME-like message containing the description of the information to be announced. The server running the allocator program then subsequently responds with a status line, including the message's protocol version and a success or error code, followed by a MIME-like message containing the information of the AThID that has been allocated.

In more detail, the HTTP communication is initiated by a user agent associated with the announcing application 10 and consists of a request to be applied to a resource on some server. The HTTP communication usually takes place over TCP/IP connections. The default port is TCP 80, but other ports can be used. This does not preclude HTTP from being implemented on top of any other protocol on the Internet, or on other networks. HTTP only presumes a reliable transport; any protocol that provides such guarantees can be used. In this design we use HTTP v1.1 but other versions could be used.

The POST HTTP method is used to request that the allocator program accepts the entity enclosed in the request as a new subordinate of the request URL in the request line. POST is an HTTP method designed to provide a block of data to a data handling process. If the entity enclosed is passed correctly to the data handling process in the allocator an OK answer is sent back including an entity that describes the AThID.
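As an illustration of the request/reply exchange, the announcing application's side might be sketched with the Python standard library as below. The helper names are hypothetical, and two details are assumptions that the text does not fix: the XML description is sent with an `application/xml` content type, and the entity in the OK reply is taken to be the allocated AThID in its ASCII form.

```python
import urllib.request


def build_allocation_request(allocator_url: str,
                             description_xml: str) -> urllib.request.Request:
    """Build (but do not send) the POST carrying the XML description."""
    return urllib.request.Request(
        allocator_url,
        data=description_xml.encode("utf-8"),
        # Assumption: the MIME-like message is labelled as XML.
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


def athid_from_reply(status: int, body: bytes) -> str:
    """Extract the allocated AThID from the allocator's reply."""
    if status != 200:
        raise RuntimeError(f"allocator returned error code {status}")
    # Assumption: the reply entity is the ASCII AThID itself.
    return body.decode("ascii").strip()
```

The request object could then be passed to `urllib.request.urlopen`, whose response exposes the status code and body needed by `athid_from_reply`.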

Upon receipt of the POST request, the allocator 20 then performs the following steps (more precisely, the host computer hosting the allocator program performs the following steps under the control of the program).

Having received the request at step 4.2, the next step (s.4.4) is that, if required, the allocator ID is generated. Usually this step would not be carried out, for the reason that the allocator ID is preferably a pre-defined URL (or email address or IP address, as we describe above). However, in some embodiments both a new allocator ID and an ATh# may be combined to form an AThID, and hence this step is provided as an optional step.

Following step 4.4, at step 4.6 the received XML script which provides the human- and machine-readable description of the subject matter of the announcement thread is stored in a local store 22 provided at the allocator 20. This is so that a record is kept at the allocator of the announcement threads for which an AThID has been issued.

Next, at step 4.8, the allocator program hashes the description contained in the XML file and the random number contained in the value field to give the Announcement Thread Number. That is, the ATh# is given as follows:

ATh#=md5(XML <DESCRIPTION>, XML <VALUE>)

As we mentioned above, an ATh# preferably consists of 16 bits, although the preferred hash function is MD5, which gives a 128-bit output. The output of the hash function is therefore truncated to the first 16 bits to obtain the ATh#.

Following the generation of the ATh#, a check is performed next at step 4.9 to check that the generated ATh# is unique in the context of the particular allocator (note that it does not have to be globally unique across all available allocators, but only unique in the context of the allocator ID with which it will be combined). This check is performed by matching the generated ATh# with previously generated ATh#s, which are stored in the local store 22. If it is determined that in fact the generated ATh# is not unique, i.e. the allocator has produced that ATh# before and has combined the ATh# with the same allocator ID which is to be used in the present case, then a different ATh# must be obtained. This is produced by generating a further random number value which is then substituted into the <VALUE> field of the XML script, and the hash function is applied to this modified data to give a further hash value, which is once again truncated to 16 bits. This further ATh# value is then compared to see if it is unique within the given context. This process is repeated until a unique ATh# is obtained.
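Steps 4.8 and 4.9 can be sketched as follows. The function names are hypothetical, and the sketch makes assumptions where the text is silent: the description and value are simply concatenated as UTF-8 text before hashing, “the first 16 bits” is read as the first two octets of the digest taken big-endian, and a result of zero is also rejected because ATh#=0 is reserved.

```python
import hashlib
import secrets
from typing import Callable, Set, Tuple


def ath_number(description: str, value: int) -> int:
    """MD5 over description and value, truncated to the first 16 bits."""
    # Assumption: the two inputs are concatenated as UTF-8 text.
    digest = hashlib.md5(f"{description}{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:2], "big")


def allocate_number(description: str,
                    used: Set[int],
                    first_value: int,
                    draw: Callable[[], int] = lambda: secrets.randbits(32),
                    ) -> Tuple[int, int]:
    """Return (ATh#, value finally used), unique within one allocator ID."""
    if len(used) >= 2 ** 16 - 1:
        raise RuntimeError("ATh# space exhausted for this allocator ID")
    value = first_value
    while True:
        number = ath_number(description, value)        # step 4.8
        if number != 0 and number not in used:         # step 4.9
            used.add(number)                           # remember for later checks
            return number, value
        value = draw()                                 # substitute a new <VALUE>
```

Here `used` stands in for the previously generated ATh#s held in the local store 22 for the allocator ID in question.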

Having obtained a unique ATh#, next at step 4.10 the whole AThID is generated by concatenating the obtained ATh# with the allocator ID used by the allocator. As we explained previously, the allocator ID is preferably a URL. The concatenation is performed according to the AThID format described previously, and hence an AThID of the form:

“ath:” <Scheme id> “=” <Allocator id> “$” <Announcement thread number> as an ASCII representation, or of the form:

<Scheme id> | <Allocator id> | <Announcement thread number> for a binary representation is obtained.

Having generated the full AThID, at step 4.12 the allocator 20 acts to store the generated AThID in the local store 22. The AThID is stored referenced to the XML description of the announcement thread for which it is generated. As discussed above, the purpose of storing the AThID is to allow a comparison of newly generated AThIDs with previously generated AThIDs.

Finally, at step 4.14 the allocator 20 transmits the generated AThID back to the requesting announcer as part of the OK response to the POST request. The announcer 12 can then use the AThID in any announcement messages belonging to the announcement thread.

We now describe further embodiments which introduce additional functionality to the embodiments described above.

The embodiment described above does not include security requirements. Therefore, in another embodiment the session is initiated using the HTTP protocol and the known Secure Sockets Layer (SSL). In such a case the allocator 20 obtains knowledge of the announcer that has requested a new AThID. Exploiting this option, the allocator stores the XML file in association with the certificate of the announcing application. This option allows the allocator to restrict the allocation of AThIDs to specific announcers.

A further embodiment makes provision for the prevention of Denial of Service (DoS) attacks. A simple DoS attack could prevent the above described embodiments from working properly. A malicious announcer could flood an allocator with different AThID requests. The allocator would in the normal course of operation as described above allocate as many AThIDs as the number of requests. In this scenario the number of useless AThIDs allocated would be very high, reducing the space and the resources available for real AThIDs.

In order to mitigate this attack scenario, in a further embodiment we require that the allocator 20 after sending the HTTP OK does not store the AThID but instead requests an acknowledgement from the announcer containing the previous and the current random number. If the requested acknowledgement is not received the allocator times out the request. With such a simple method we require the announcer to maintain some computing resource for each AThID request sent, and hence it will not be possible for the announcer to flood the allocator with AThID requests.
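One possible reading of this acknowledgement scheme is sketched below. Everything concrete in it is an assumption made for illustration: the class name `PendingAllocations`, the 30 second timeout, the interpretation of the “previous” and “current” random numbers as the value the announcer submitted and the value the allocator finally hashed, and the choice to keep a pending entry after a mismatched acknowledgement.

```python
import time
from typing import Callable, Dict, Set, Tuple


class PendingAllocations:
    """Holds AThIDs that were offered but are not yet acknowledged."""

    def __init__(self, timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._timeout = timeout_s
        self._clock = clock
        # athid -> (previous value, current value, deadline)
        self._pending: Dict[str, Tuple[int, int, float]] = {}
        self.committed: Set[str] = set()   # stands in for local store 22

    def offer(self, athid: str, previous: int, current: int) -> None:
        """Called after the HTTP OK is sent; nothing is stored permanently."""
        self._pending[athid] = (previous, current, self._clock() + self._timeout)

    def expire(self) -> None:
        """Time out every request whose acknowledgement never arrived."""
        now = self._clock()
        for athid in [a for a, e in self._pending.items() if e[2] <= now]:
            del self._pending[athid]

    def acknowledge(self, athid: str, previous: int, current: int) -> bool:
        """Commit the AThID only if both random numbers match in time."""
        self.expire()
        entry = self._pending.get(athid)
        if entry is None or entry[:2] != (previous, current):
            return False
        del self._pending[athid]
        self.committed.add(athid)
        return True
```

An announcer that never acknowledges therefore consumes only a short-lived pending entry rather than a permanent slot in the ATh# space.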

In a further embodiment, an announcer could have the ability to allocate a large number of AThIDs to a specific announcing application: in this case the AThIDs could all be regrouped under a specific context (for example a directory in a URL). For commercial reasons it may be important that the user does not specify the specific context; instead, it is the allocator that provides this function. For example an allocated AThID could look like:

<ath:URL=http://www.hosting.org/AThID?set=farm$31425>

In this example the allocator has allocated a specific set of ATh#s called “farm” for a specific announcing application.

A more complicated embodiment could provide the feature of creating a set of AThIDs without receiving requests from the announcer. In this case we require the allocator to ask for feedback from the listener population and to aggregate together in a specific set AThIDs that have similar interests. This option could be very useful since it allows the creation of logical structures of different AThIDs based on user experience: in this case based on user feedback. The only information required from the announcer is the XML file that can be used together with user feedback.

Such a scheme could be very useful to allow searching of similar AThIDs without the need to go to the announcing application (for example in a search engine).

We turn now to the issue of how to deregister an existing AThID. The process of deregistration is difficult to define. The problem is that an AThID can be used by different applications. Different applications could use the same AThID to exchange particular software updates in different and separate contexts. A single user cannot decide to deregister or delete a specific AThID since it could be used by another application that the user cannot control. However there are requirements to deregister an AThID because it could become obsolete after a certain amount of time.

In order to get around the above problem we propose two methods that allow users to deregister an existing AThID:

i) TIME TO LIVE (TTL). In one embodiment the AThID is associated with a particular time-to-live that is stored on the allocator. This time-to-live information represents a time stamp (date) after which the AThID will be discarded. To avoid an AThID being discarded the allocator needs to receive a refresh message. This refresh message can be transmitted by any announcing application that is using the specific AThID. As soon as the TTL is renewed the allocator can announce such to other announcing applications. If the TTL is not refreshed before the deadline the AThID is silently discarded by the allocator.

ii) Announcing application owns the AThID. In this embodiment only a specific announcing application can use and manage a particular AThID. The announcing application can decide when to delete an announcement. Discarding the AThID does not affect other applications because it is only announced by a specific application.

The implementation of this scheme requires a POST HTTP message containing the parameter of the AThID to be deleted. It is important that the option to delete an AThID is only allowed when a security scheme is in place.
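The time-to-live method (i) above lends itself to a small sketch. The class name `TtlRegistry`, the injected clock and the explicit `sweep` call are illustrative assumptions; the text only requires that an AThID is discarded once its deadline passes without a refresh.

```python
import time
from typing import Callable, Dict, List


class TtlRegistry:
    """Tracks the time stamp after which each AThID will be discarded."""

    def __init__(self, ttl_s: float,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_s
        self._clock = clock
        self._deadline: Dict[str, float] = {}

    def register(self, athid: str) -> None:
        self._deadline[athid] = self._clock() + self._ttl

    def refresh(self, athid: str) -> bool:
        """Renew the TTL; any application using the AThID may call this."""
        self.sweep()
        if athid not in self._deadline:
            return False               # already discarded or never registered
        self._deadline[athid] = self._clock() + self._ttl
        return True

    def sweep(self) -> List[str]:
        """Silently discard every AThID whose deadline has passed."""
        now = self._clock()
        dead = [a for a, d in self._deadline.items() if d <= now]
        for athid in dead:
            del self._deadline[athid]
        return dead

    def is_live(self, athid: str) -> bool:
        self.sweep()
        return athid in self._deadline
```

A refresh that arrives after the deadline returns `False`, which mirrors the silent discard: the announcing application would then have to request a new AThID.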

In conclusion, therefore, the addressing scheme we describe is particularly efficient in a scenario such as GAP, where an ATh# has to maintain its uniqueness properties within a well-specified Multicast channel, and the full AThID is only used when a collision is present on the channel. Notice here that an address (if needed) can be referred to a particular user/machine but this is not a requirement.

With regard to the application of the invention to other messaging schemes, large scale messaging schemes usually require that the information is accessible everywhere in the network in an efficient and unique way. The addressing scheme we have described uses a process that allows a stable and unique identifier to be used by different messaging solutions in a seamless manner. The same AThID can be used to address the same information on different platforms and provided by different users.

Our addressing scheme provides two main advantages:

i) The ability for anyone to allocate an AThID using anyone else's allocator, allowing an allocator of suitable stability to be chosen for each thread in question, rather than having to use one in one's own (possibly insufficiently stable) context; and

ii) The ability to generate announcement addresses comprising a generator ID and a preferably random announcement ID, and allowing these two parts to be exploited differently depending on the specific context.

We conclude with an example of a possible commercial use of our addressing scheme.

Here, an organization that is renowned in terms of stability allocates a stable allocator ID to be used for AThIDs. For example, we may imagine a general identifier for software updates for the 3G protocol being provided by a stable organisation such as the IEEE, which allocates a unique identifier for this subject. Thanks to the generated Announcement Thread Number being combined with the allocator ID the resulting AThID is random enough to avoid ownership disputes in the future (characteristic of the classic URL scheme). It is important to notice that the resources of the stable allocator are separated from any other resources when the AThID is used, such that organisations like the IEEE are not discouraged from offering such a service. The service consumes a microscopic resource and never requires them to arbitrate over ownership of names.

Unless the context clearly requires otherwise, throughout the description and the claims, the words “comprise”, “comprising” and the like are to be construed in an inclusive as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”.

1. An announcement method for use in a publish-subscribe architecture, the method comprising: compiling an index message containing a plurality of sequence identifiers respectively identifying a plurality of sequences of messages, each message in each sequence relating to substantially the same subject matter; and transmitting the compiled index message onto an index channel; the method being characterised in that the sequence identifiers comprise at least two sub-parts, and the compiling step further comprises, for any sequence identifier to be included within the index message, including within the index message only those sub-parts of a sequence identifier which are necessary to uniquely identify the sequence identifier from the other sequence identifiers included within the message.
2. A method according to claim 1, and further comprising the step of requesting the allocation of a sequence identifier from an allocator; and receiving a message from the allocator containing the requested sequence identifier.
3. A method of allocating a sequence identifier to a sequence of messages relating to substantially the same subject matter and which are to be transmit onto one or more communications channels, the method comprising: generating a first sub-part of a sequence identifier part, the first sub-part being semantically meaningless when considered alone; combining the generated first sub-part of the identifier with a second, meaningful, sequence identifier sub-part to provide the sequence identifier; and allocating the sequence identifier to the sequence of messages; wherein the first sequence identifier sub-part is generated such that when combined with the second sequence identifier sub-part the resulting sequence identifier is unique at least at that time.
4. A method according to claim 1 wherein a first sub-part of a sequence identifier is a network address or other network locator.
5. A method according to claim 4, wherein the first sub-part is a Universal Resource Locator (URL).
6. A method according to claim 4, wherein the first sub-part is an email address.
7. A method according to claim 4, wherein the first sub-part is an Internet Protocol network address.
8. A method according to claim 1, wherein a second sub-part of a sequence identifier is a number.
9. A method according to claim 8, wherein the number is randomly generated.
10. A method according to claim 8, wherein the number is produced by applying a hash function to data defining the subject matter of the sequence of messages.
11. A method according to claim 3, and further comprising generating the meaningful part of the sequence identifier, if required.
12. A method according to claim 3, and further comprising checking if the generated meaningless sub-part of the sequence identifier has been previously generated, and if so generating another meaningless sequence identifier sub-part; wherein the checking and generating steps are repeated until a meaningless sequence identifier sub-part is obtained which has not been previously generated.
13. A method according to claim 3, and further comprising the step of receiving a request for a sequence identifier, the allocating step then further comprising transmitting the subsequently obtained sequence identifier to the party or element from which the request was received.
14. An announcement method for use in a publish-subscribe architecture, the method comprising: transmitting a sequence of messages relating to substantially the same subject matter on to one or more communications channels, each message in the sequence including at least part of a sequence identifier, the method being characterised in that the sequence identifier is allocated to the sequence in accordance with claim 3.
15. An announcement method for use in a publish-subscribe architecture, the method comprising transmitting an index message onto an index channel, the index message containing one or more sequence identifiers respectively identifying one or more sequences of messages, each message in each sequence relating to substantially the same subject matter, the method being characterised in that the sequence identifiers are respectively allocated to the sequences of messages in accordance with claim 3.
16. A computer program or suite of computer programs arranged such that when executed on a computer system it or they cause the computer system to operate in accordance with the method of claim 1.
17. A computer readable storage medium storing the computer program or at least one of the suite of computer programs according to claim 16.
18. An announcement system for use in a publish-subscribe architecture, the system comprising: message compiling means arranged in use to compile an index message containing a plurality of sequence identifiers respectively identifying a plurality of sequences of messages, each message in each sequence relating to substantially the same subject matter; and means for transmitting the compiled index message onto an index channel; the system being characterised in that the sequence identifiers comprise at least two sub-parts, and the message compiling means is further arranged to operate, for any sequence identifier to be included within the index message, to include within the index message only those sub-parts of a sequence identifier which are necessary to uniquely identify the sequence identifier from the other sequence identifiers included within the message.
19. A system according to claim 18, and further comprising means for requesting the allocation of a sequence identifier from an allocator; and means for receiving a message from the allocator containing the requested sequence identifier.
20. An apparatus for allocating a sequence identifier to a sequence of messages relating to substantially the same subject matter and which are to be transmit onto one or more communications channels, the apparatus comprising: identifier part generation means for generating a first, meaningless, sequence identifier sub-part; sequence identifier generation means arranged to combine the generated meaningless identifier part with a second, meaningful, sequence identifier sub-part to provide the sequence identifier; and sequence identifier allocating means for allocating the sequence identifier to the sequence of messages; wherein the first sequence identifier sub-part is generated such that when combined with the second sequence identifier sub-part the resulting sequence identifier is unique at least at that time.
21. A system according claim 18, wherein a first sub-part of a sequence identifier is a network address or other network locator.
22. A system according to claim 21, wherein the first sub-part is a Universal Resource Locator (URL).
23. A system according to claim 21, wherein the first sub-part is an email address.
24. A system according to claim 21, wherein the first sub-part is an Internet Protocol network address.
25. A system according to claim 18, wherein a second sub-part of a sequence identifier is a number.
26. A system according to claim 25, wherein the number is randomly generated.
27. A system according to claim 25, wherein the number is produced by applying a hash function to data defining the subject matter of the sequence of messages.
28. An apparatus according to claim 20, and further comprising means for generating the meaningful part of the sequence identifier, if required.
29. An apparatus according to claim 20, and further comprising checking means for checking if the generated meaningless part of the sequence identifier has been previously generated; the identifier part generation means being further operable to generate another meaningless sequence identifier part if the checking means indicates that the generated meaningless part of the sequence identifier has been previously generated; wherein the checking means and the identifier part generation means repeat their respective operations until a meaningless sequence identifier part is obtained which has not been previously generated.
30. An apparatus according to claim 18, and further comprising the step of means for receiving a request for a sequence identifier; and the sequence identifier allocating means further comprising means for transmitting the subsequently obtained sequence identifier to the party or element from which the request was received.
31. An announcement system for use in a publish-subscribe architecture, the system comprising: message transmission means for transmitting a sequence of messages relating to substantially the same subject matter on to one or more communications channels, said means being operable to include in each message at least part of a sequence identifier, the system being characterised in that the sequence identifier having been allocated to the sequence by an apparatus according to claim 18.
32. An announcement system for use in a publish-subscribe architecture, the system comprising: message transmission means for transmitting an index message onto an index channel, the index message containing one or more sequence identifiers respectively identifying one or more sequences of messages, each message in each sequence relating to substantially the same subject matter, the system being characterised in that the sequence identifiers are respectively allocated to the sequences of messages by an apparatus according to claim 18.
33. An announcement system according to claim 31, and further comprising means for requesting the allocation of a sequence identifier from an apparatus.